People come into contact with enormous amounts of information over the course of their private and professional lives. Much of this information is in printed form. People often wish to retain printed information they have encountered that is of personal and/or professional interest. Retaining convenient access to the enormous amount of printed information one encounters has typically involved the storage of large numbers of books, magazines, folders, files and so on, and complex and inconvenient cataloguing schemes.
Today, much information that is available in printed form is also available in electronic form. News, articles, books, essays, opinions, reviews, analysis, and other content that one may encounter in newspapers, magazines, and books may often now be located through use of the Internet. However, for many reasons, people still tend to interact with a great deal of printed content.
Various schemes for storing, locating, and retrieving electronic content corresponding to printed content have been attempted. For example, U.S. Pat. Nos. 6,671,684 and 6,658,151 describe techniques for retrieving a stored electronic image of a document using a page of the document. However, these techniques require large amounts of scanned information from the documents, and are not readily applied when the documents are not available as images. Furthermore, these techniques do not leverage the enormous and expanding capabilities of commercial Internet search engines.
U.S. Pat. No. 6,226,631 exemplifies another approach of the past. This patent describes a process whereby a portion of a displayed electronic image may be selected (using, for example, a mouse), converted to text, and submitted to a search engine. However, the described approach requires the availability and display of an electronic image of the document, i.e., the document is already identified and displayed in electronic image format. This approach does not provide for identifying an electronic document corresponding to a rendered document using a small amount of information scanned from the document (because the document is already identified and displayed), nor is it suitable for cataloguing the large amounts of printed information that people may encounter on a routine basis.